1. EasyMultiProfiler
EasyMultiProfiler
package adopts an object-oriented programming method, integrates multiple data analysis modules, and realizes smooth data transfer between different modules, thus standardizing and clarifying the overall analysis process. EasyMultiProfiler
not only includes the current mainstream data pre-processing methods but also has powerful analysis capabilities. It covers a variety of mainstream analysis methods (including diversity analysis, cluster analysis, correlation analysis, difference analysis, dimensionality reduction analysis, etc.), and also supports GSEA analysis, WGCNA analysis, and multi-omics analysis. Users can quickly explore data features and patterns and visually interpret analysis results in a variety of visualizations, including box plot, scatter plot, enriched dot plot, enriched network plot, bivariate plot, enriched curve plot, heatmap, sankey plot, and structural histogram. All in all, EasyMultiProfiler
provides users with a one-stop data processing, analysis, and visualization service, greatly simplifying intermediate processes, and is logically clear and very beginner-friendly.
1.1 EasyMultiProfiler Installation
R version must be above 4.3.3. (IMPORTANT!!)
1.1.1 First installation
(1) Mac os users
Option 1 (Recommended) :
Step 1 Install the gfortran Compiler Environment
Many R packages require a compiler environment during installation. To ensure maximum compatibility, it is recommended that new users install the following version of gfortran rather than the latest version.
Download link: Official Source
Step 2 Install Using pak
if (!requireNamespace("pak", quietly=TRUE)) install.packages("pak")
pak::pak("liubingdong/EasyMultiProfiler")
library(EasyMultiProfiler)
Option 2 :
Step 1 Install clusterProfiler
if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager")
BiocManager::install("clusterProfiler")
library(clusterProfiler)
Step 2 Install Using remotes
if (!requireNamespace("remotes", quietly=TRUE)) install.packages("remotes")
remotes::install_github("liubingdong/EasyMultiProfiler")
library(EasyMultiProfiler)
(2) Windows users
Step1 Install RTool
- According to the installed R version, install the corresponding version of RTool (e.g. R 4.3.x requires RTool 4.3, R 4.4.x requires RTool 4.4).
- RTool download address 1: official original address; download address 2: Chinese mirror address.
- RTool installation steps: On the download page, click
[Download R for Windows]
→[Rtools]
)
Step2 After successfully installing the corresponding version of RTool, please restart R.
Step3 Install the EasyMultiProfiler package using the pak package.
if (!requireNamespace("pak", quietly=TRUE)) install.packages("pak")
pak::pak("liubingdong/EasyMultiProfiler")
library(EasyMultiProfiler)
1.1.2 Update version
EasyMultiProfiler
is constantly updating more analysis and visualization modules, and when an update is needed, it can be updated by running the pak package again.
pak::pak("liubingdong/EasyMultiProfiler")
library(EasyMultiProfiler)
1.2 EasyMultiProfiler Basics
1.2.1 Data Object Container for EasyMultiProfiler
MAE
object: namelyMultiAssayExperiment
.MAE
object is an object that users need to create initially. It is usually composed of assay and corresponding coldata and rowdata of multiple omics projects (see Section 1.2.2 for relevant definitions).MAE
object of single omics project can also be created.EMPT
object: namelyEMP-transporter
.EMPT
object is extracted fromMAE
object.EMPT
object inheritsSummarizedExperiment
data storage object, mainly includingassay
,rowdata
,coldata
, and parameters required for various process analysis.EMP
object: When multipleEMPT
objects are analyzed jointly, they can be merged into oneEMP
object.
① In order to ensure the original data is not tampered with as much as possible, the process from MAE to EMPT during data analysis is unidirectional, meaning that an EMPT object cannot be put back into an MAE object.
②If you need to modify the EMPT object within the MAE object, simply adjust the EMPT object, and reassemble it into the MAE object.
③When conducting joint analysis, the EMPT objects can be directly combined using the + symbol to form EMP objects, thus facilitating the process of multi-omics analysis.
④When the names of the projects involved in the combination are the same, the name of the subsequent project will be automatically modified to ensure the uniqueness of the project names in the EMPT object.

1.2.2 Data Composition of EasyMultiProfiler
Each project data stored in the MAE
object contains the following three contents: assay
、coldata
、rowdata
. assay
refers to the experimental data of the project; coldata
refers to the sample-related data; and rowdata
refers to the feature-related data.
Assay | Coldata | Rowdata | |
---|---|---|---|
Column | Feature | Sample related data | Feature related data |
Row | Sample ID | Sample ID | Feature |

1.2.3 Data Presentation of EasyMultiProfiler
EasyMultiProfiler
package uses tibble
format (a data frame type) to present data, where under the tibble
column names, users can view the corresponding column data types. Common data types include: int - integer type, dbl - floating point type, chr - string type, lgl - logical type (True and False), fct - factor type, date - date type, dttm - datetime type.
1.2.4 Data Transfer of EasyMultiProfiler
EasyMultiProfiler
package uses |>
operator to facilitate data transfer, where the output of the previous module serves as the input for the next module (similar to the function of the %>%
operator in package Magrittr
), thus enabling sequential data processing operations. Because EMPT
objects are passed through the data analysis process, there is little mandatory analysis order throughout the process. Users are free to explore data analysis in individual modules without worrying about matching data delivery.
1.2.5 Action Parameters of EasyMultiProfiler
When the EasyMultiProfiler
package performs data analysis, project data is stored in EMPT
object and passed between functional modules. After the user completes the analysis of a module, the parameter action
can be used to extract the original analysis results generated by the module. The default value of this parameter is action='add'
, which means that the analysis result of the current module will be added to the EMPT
object for transmission; action='get'
, which means that the analysis result of the current module will be directly output.
The parameter action in the module EMP_filter is used to specify whether the module operation is to kick or select.
1.2.6 Caching Mechanism of EasyMultiProfiler
The main analysis modules of EasyMultiProfiler
package have a cache snapshot mechanism, that is, when the same data is used for analysis repeatedly, the analysis process will automatically extract the corresponding data analysis results from the cache according to the hash value, so as to avoid double calculation. This caching feature greatly improves the convenience of the analysis process and avoids the need to create multiple intermediate variables in traditional data analysis to store complex computing results. Users only need to pass the current calculation results through the |>
operator to successfully complete each analysis step, making the entire analysis process clearer and more concise.
① The caching mechanism takes into account the differences between data, parameters and modules, and eliminates the possible data conflicts.
② If you need to recalculate, you can specify the parameter use_cached=FALSE to cancel the cache of this module.
🏷️Example: The first calculation takes about 7 seconds. When it was calculated again, it took only 0.004 seconds.
data(MAE)
## First run
system.time({
MAE |>
EMP_assay_extract('geno_ec') |>
EMP_WGCNA_cluster_analysis(RsquaredCut = 0.85)
})
## Second run
system.time({
MAE |>
EMP_assay_extract('geno_ec') |>
EMP_WGCNA_cluster_analysis(RsquaredCut = 0.85)
})
## Close the cached
system.time({
MAE |>
EMP_assay_extract('geno_ec') |>
EMP_WGCNA_cluster_analysis(RsquaredCut = 0.85,use_cached = FALSE)
})